
feat: batch cold account loads in light client #2341

Open
sergeytimoshin wants to merge 9 commits into main from sergey/load-accounts-batching

Conversation

@sergeytimoshin sergeytimoshin commented Mar 14, 2026

Summary by CodeRabbit

  • New Features

    • Snapshot-based address batch retrieval and a reported "next insertion" index for address queues.
  • Refactor

    • Stronger runtime validation, clearer error messages, and deterministic, fixed-height batched proof reconstruction.
  • Public API

    • Safer borrow-based inputs, more explicit Result-returning paths, and a new integer-conversion error variant.
  • Tests

    • Updated tests and helpers to match new inputs, batching, and validation.
  • Bug Fixes

    • More robust prover startup/health checks and normalized local server address.

coderabbitai bot commented Mar 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds a const-generic AddressBatchSnapshot, replaces get_batch_data with get_batch_snapshot returning Result<Option<...>>, makes proof reconstruction HEIGHT-generic, propagates tree_next_insertion_index through indexer outputs, and tightens error propagation across forester, prover client, and SDK indexer code paths.
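The shape of the new snapshot API can be sketched as follows. This is a hypothetical illustration of the signatures described above, not the PR's actual definitions: field names, the error type, and the placeholder bodies are assumptions.

```rust
// Illustrative sketch of a const-generic batch snapshot; all names and
// types here are assumptions, not the PR's real definitions.
pub struct AddressBatchSnapshot<const HEIGHT: usize> {
    pub addresses: Vec<[u8; 32]>,
    pub low_element_proofs: Vec<[[u8; 32]; HEIGHT]>,
    pub leaves_hashchain: [u8; 32],
}

pub fn get_batch_snapshot<const HEIGHT: usize>(
    queue_len: usize,
) -> Result<Option<AddressBatchSnapshot<HEIGHT>>, String> {
    // Ok(None) models "batch not ready yet", so callers can poll without
    // treating an empty queue as a hard error.
    if queue_len == 0 {
        return Ok(None);
    }
    Ok(Some(AddressBatchSnapshot {
        addresses: vec![[0u8; 32]; queue_len],
        low_element_proofs: vec![[[0u8; 32]; HEIGHT]; queue_len],
        leaves_hashchain: [0u8; 32],
    }))
}

fn main() {
    assert!(get_batch_snapshot::<40>(0).unwrap().is_none());
    let snap = get_batch_snapshot::<40>(2).unwrap().unwrap();
    assert_eq!(snap.addresses.len(), 2);
    assert_eq!(snap.low_element_proofs[0].len(), 40);
}
```

The `Result<Option<...>>` return distinguishes "indexer error" from "batch not ready", which is the key behavioral change over the old `get_batch_data`.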

Changes

Cohort / File(s) Summary
Forester — batch snapshot API
forester/src/processor/v2/helpers.rs
Adds pub struct AddressBatchSnapshot<const HEIGHT: usize>; replaces get_batch_data() with get_batch_snapshot<const HEIGHT>(...) -> Result<Option<AddressBatchSnapshot<HEIGHT>>>; validates empty batches, derives missing leaves_hashchain, reconstructs low_element_proofs, and removes BatchDataSlice.
Forester — batch processing & workers
forester/src/processor/v2/strategy/address.rs, forester/src/processor/v2/processor.rs, forester/src/processor/v2/proof_worker.rs
Switches callers to use get_batch_snapshot(...) and destructure snapshot fields; delegates to staging_tree.process_batch(...) using snapshot leaves_hashchain; makes proof-worker construction and spawn return crate::Result and uses ? to propagate errors.
Indexer — queue proof APIs & outputs
sdk-libs/client/src/indexer/types/queue.rs, sdk-libs/client/src/indexer/photon_indexer.rs, sdk-libs/program-test/src/indexer/test_indexer.rs
Converts proof APIs to const-generic HEIGHT returning [[u8;32]; HEIGHT], adds reconstruct_proofs()/reconstruct_all_proofs(), node-lookup helpers, validate_proof_height, and propagates tree_next_insertion_index into indexer outputs/tests; tightens proof length validation.
Prover client — inputs, helpers & client
prover/client/src/proof_types/batch_address_append/proof_inputs.rs, prover/client/src/helpers.rs, prover/client/src/errors.rs, prover/client/src/prover.rs, prover/client/src/proof_client.rs, prover/client/src/constants.rs, prover/client/src/proof_types/*/proof_inputs.rs
Changes get_batch_address_append_circuit_inputs to accept borrowed slices and fixed-size per-height proof arrays; makes compute_root_from_merkle_proof return Result and propagates errors; adds ProverClientError::IntegerConversion; centralizes HTTP client builder (build_http_client) and richer health-checks; proof client constructors now return Result.
Prover tests & harness
prover/client/tests/batch_address_append.rs
Adapts tests to reference-based inputs and fixed-size proof arrays, adds helper to collect non-inclusion data and a warmup step, and adjusts hashchain and tree initialization to new snapshot/borrowed input shapes.
SDK client — load_accounts batching & errors
sdk-libs/client/src/interface/load_accounts.rs
Adds LoadAccountsError::MissingPdaCompressedData; groups PDAs by program id (group_pda_specs), introduces fetch_proof_batches/fetch_individual_proofs, and updates start_index handling per-group.
SDK types — CPI / decompression seeds
sdk-libs/sdk-types/src/interface/program/decompression/*.rs
Changes token_seeds from Vec<Vec<u8>> to Vec<Vec<Vec<u8>>> and updates signer-seed construction and CPI invocation; documents queue_pubkey_index behavior for CPI in pda.rs.
Misc — small API/constant tweaks
prover/client/src/constants.rs, prover/client/src/prover.rs, prover/client/src/proof_client.rs, other small files
SERVER_ADDRESS changed to http://127.0.0.1:3001; HTTP client built via build_http_client() (.no_proxy()), enhanced health checks, new error variant, and several functions updated to return Result for better propagation.

Sequence Diagram(s)

sequenceDiagram
    participant Stream as StreamingQueue
    participant Forester as Forester::ProcessorV2
    participant Staging as StagingTree
    participant Prover as ProverClient

    Stream->>Forester: get_batch_snapshot<const HEIGHT>(start,end,hashchain_idx)
    Stream-->>Forester: AddressBatchSnapshot{addresses, low_values, indices, proofs, leaves_hashchain}
    Forester->>Staging: staging_tree.process_batch(snapshot, metadata)
    Staging-->>Forester: ProcessResult{proof_input, new_root}
    Forester->>Prover: prepare circuit inputs (borrowed slices & fixed-size proofs)
    Prover-->>Forester: circuit inputs / proof result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested labels

ai-review

Suggested reviewers

  • SwenSchaeferjohann
  • ananas-block

Poem

🌲 Snapshots rise where heights align,
Borrowed slices pass the proofing line.
Streams hand off, staging makes the claim,
Errors surfaced, indices renamed. 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'feat: batch cold account loads in light client' accurately reflects the main objective of the PR, which reorganizes PDA loading with batched fetch strategies and groups PDA specs by program ID.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 70.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk-libs/program-test/src/indexer/test_indexer.rs (1)

2498-2534: ⚠️ Potential issue | 🔴 Critical

Non-2xx prover responses can hang this loop indefinitely.

When send() succeeds but the prover returns a 4xx/5xx, the loop neither decrements retries nor returns an error, so it can spin forever. The same block also still unwraps text() and deserialize_gnark_proof_json(), which reintroduces panics on malformed responses. Please route both status failures and decode failures through the same IndexerError path.

💡 Proposed fix
             let mut retries = 3;
             while retries > 0 {
                 let response_result = client
                     .post(format!("{}{}", SERVER_ADDRESS, PROVE_PATH))
                     .header("Content-Type", "text/plain; charset=utf-8")
                     .body(json_payload.clone())
                     .send()
                     .await;
-                if let Ok(response_result) = response_result {
-                    if response_result.status().is_success() {
-                        let body = response_result.text().await.unwrap();
-                        let proof_json = deserialize_gnark_proof_json(&body).unwrap();
+                match response_result {
+                    Ok(response_result) if response_result.status().is_success() => {
+                        let body = response_result
+                            .text()
+                            .await
+                            .map_err(|e| IndexerError::CustomError(e.to_string()))?;
+                        let proof_json = deserialize_gnark_proof_json(&body)
+                            .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         let (proof_a, proof_b, proof_c) = proof_from_json_struct(proof_json)
                             .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         let (proof_a, proof_b, proof_c) =
                             compress_proof(&proof_a, &proof_b, &proof_c)
                                 .map_err(|e| IndexerError::CustomError(e.to_string()))?;
                         return Ok(ValidityProofWithContext {
                             accounts: account_proof_inputs,
                             addresses: address_proof_inputs,
                             proof: CompressedProof {
                                 a: proof_a,
                                 b: proof_b,
                                 c: proof_c,
                             }
                             .into(),
                         });
                     }
-                } else {
-                    println!("Error: {:#?}", response_result);
-                    tokio::time::sleep(Duration::from_secs(5)).await;
-                    retries -= 1;
+                    Ok(response_result) => {
+                        let status = response_result.status();
+                        let body = response_result.text().await.unwrap_or_default();
+                        retries -= 1;
+                        if retries == 0 {
+                            return Err(IndexerError::CustomError(format!(
+                                "Prover returned {}: {}",
+                                status, body
+                            )));
+                        }
+                    }
+                    Err(err) => {
+                        retries -= 1;
+                        if retries == 0 {
+                            return Err(IndexerError::CustomError(err.to_string()));
+                        }
+                    }
                 }
+                tokio::time::sleep(Duration::from_secs(5)).await;
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2498 - 2534,
The loop currently only decrements retries on transport errors and unwraps
response body and deserialization, causing hangs or panics for non-2xx or
malformed responses; change the logic in the while retries loop that starts with
`let response_result = client.post(...).send().await` to treat any non-success
status as a failure path that decrements `retries`, sleeps, and eventually
returns an `IndexerError::CustomError` if exhausted. Specifically: when
`response_result` is Ok but `response_result.status().is_success()` is false,
read the body safely (await and map errors) and convert the status+body into an
`IndexerError::CustomError`, decrement `retries` and sleep before retrying;
replace all `.unwrap()` calls on `response_result.text().await` and
`deserialize_gnark_proof_json(&body)` with error-mapped returns into
`IndexerError::CustomError`; keep the successful branch that calls
`proof_from_json_struct` and `compress_proof` but also map their errors (they
already use map_err) and ensure any failure there returns
`Err(IndexerError::CustomError(...))` rather than panicking. This ensures
`retries` is decremented for both transport and non-2xx responses and avoids
unwrap-induced panics while preserving use of `deserialize_gnark_proof_json`,
`proof_from_json_struct`, `compress_proof`, `ValidityProofWithContext`, and
`CompressedProof`.
prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)

556-559: 🧹 Nitpick | 🔵 Trivial

Redundant slice: [0..] is unnecessary.

new_element_values[0..] is equivalent to just new_element_values. This appears to be a leftover from a refactor.

✨ Suggested simplification
-        new_element_values: new_element_values[0..]
-            .iter()
+        new_element_values: new_element_values
+            .iter()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around
lines 556 - 559, The slice new_element_values[0..] is redundant; update the
mapping to iterate directly over new_element_values (i.e., replace
new_element_values[0..].iter() with new_element_values.iter()) so the BigUint
conversion via BigUint::from_bytes_be(v) and the collect() behavior remain
unchanged; touch the expression in proof_inputs.rs where new_element_values is
used in the map to remove the unnecessary slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 246-257: Add an IntegerConversion variant to the ProverClientError
enum so the try_into() error mappings in the batch address append code compile;
specifically add a variant like IntegerConversion(String) with the
#[error("Integer conversion error: {0}")] attribute to the ProverClientError
enum in errors.rs so calls that return ProverClientError::IntegerConversion from
the low_element_index / low_element_next_index conversion sites
(low_element_indices, low_element_next_indices) are valid.
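The suggested variant can be sketched as below. This is an illustrative stand-in only: the real enum lives in errors.rs and uses thiserror's #[error(...)] attribute, which is spelled out here as a manual Display impl so the sketch stays std-only.

```rust
use std::fmt;

// Std-only stand-in for the suggested ProverClientError variant; the real
// crate uses thiserror, and `to_u32` is a hypothetical conversion site.
#[derive(Debug)]
pub enum ProverClientError {
    IntegerConversion(String),
}

impl fmt::Display for ProverClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProverClientError::IntegerConversion(msg) => {
                write!(f, "Integer conversion error: {}", msg)
            }
        }
    }
}

// Shaped like the low_element_index try_into() mapping described above.
pub fn to_u32(index: u64) -> Result<u32, ProverClientError> {
    u32::try_from(index).map_err(|e| ProverClientError::IntegerConversion(e.to_string()))
}

fn main() {
    assert_eq!(to_u32(7).unwrap(), 7);
    let err = to_u32(u64::from(u32::MAX) + 1).unwrap_err();
    assert!(err.to_string().starts_with("Integer conversion error"));
}
```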

In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: The const-generic height is unchecked and can produce
truncated or missing proofs; update the proof-producing functions (e.g.,
reconstruct_proof and the other proof-returning methods in this file) to
validate the supplied const HEIGHT against the fixed address-tree height (40)
before constructing/returning a proof and return an appropriate IndexerError if
HEIGHT != 40 (alternatively persist the queue height on the struct and compare
it), ensuring any code paths that previously assumed a different HEIGHT fail
early with a clear error.
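A minimal sketch of the shared guard, assuming a constant named ADDRESS_TREE_HEIGHT and a plain String error in place of the crate's IndexerError:

```rust
// Hypothetical shared height guard; the constant name and error type are
// assumptions, not the crate's actual identifiers.
const ADDRESS_TREE_HEIGHT: usize = 40;

fn validate_proof_height<const HEIGHT: usize>() -> Result<(), String> {
    if HEIGHT != ADDRESS_TREE_HEIGHT {
        return Err(format!(
            "invalid proof height: expected {}, got {}",
            ADDRESS_TREE_HEIGHT, HEIGHT
        ));
    }
    Ok(())
}

fn main() {
    assert!(validate_proof_height::<40>().is_ok());
    assert!(validate_proof_height::<26>().is_err());
}
```

Calling this at the top of every proof-reconstruction entry point makes the failure mode a single clear error instead of a truncated witness or a late MissingResult.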

---

Outside diff comments:
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 556-559: The slice new_element_values[0..] is redundant; update
the mapping to iterate directly over new_element_values (i.e., replace
new_element_values[0..].iter() with new_element_values.iter()) so the BigUint
conversion via BigUint::from_bytes_be(v) and the collect() behavior remain
unchanged; touch the expression in proof_inputs.rs where new_element_values is
used in the map to remove the unnecessary slice.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2498-2534: The loop currently only decrements retries on transport
errors and unwraps response body and deserialization, causing hangs or panics
for non-2xx or malformed responses; change the logic in the while retries loop
that starts with `let response_result = client.post(...).send().await` to treat
any non-success status as a failure path that decrements `retries`, sleeps, and
eventually returns an `IndexerError::CustomError` if exhausted. Specifically:
when `response_result` is Ok but `response_result.status().is_success()` is
false, read the body safely (await and map errors) and convert the status+body
into an `IndexerError::CustomError`, decrement `retries` and sleep before
retrying; replace all `.unwrap()` calls on `response_result.text().await` and
`deserialize_gnark_proof_json(&body)` with error-mapped returns into
`IndexerError::CustomError`; keep the successful branch that calls
`proof_from_json_struct` and `compress_proof` but also map their errors (they
already use map_err) and ensure any failure there returns
`Err(IndexerError::CustomError(...))` rather than panicking. This ensures
`retries` is decremented for both transport and non-2xx responses and avoids
unwrap-induced panics while preserving use of `deserialize_gnark_proof_json`,
`proof_from_json_struct`, `compress_proof`, `ValidityProofWithContext`, and
`CompressedProof`.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9c2af6fd-b65c-41f4-8efc-056487d57845

📥 Commits

Reviewing files that changed from the base of the PR and between 2644529 and 67a29a4.

⛔ Files ignored due to path filters (5)
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (8)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

@sergeytimoshin force-pushed the sergey/load-accounts-batching branch 2 times, most recently from 238cb7f to 226df79, on March 14, 2026 at 18:43
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
prover/client/src/helpers.rs (1)

49-54: ⚠️ Potential issue | 🟠 Major

Widen path_index before address trees cross u32::MAX.

compute_root_from_merkle_proof is now the shared fallible helper for both state and address trees, but it still encodes the path index as u32. The address-append callers already hold usize/u64 indices for 40-level trees and currently downcast into this API; once next_index exceeds 4_294_967_295, those casts will silently truncate the upper bits and compute the wrong root and changelog. Use usize or u64 here and narrow only at a boundary that truly requires it.

Based on learnings: Address trees must initialize with ADDRESS_TREE_INIT_ROOT_40 (hardcoded for height 40) and start with next_index = 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/helpers.rs` around lines 49 - 54, The function
compute_root_from_merkle_proof currently accepts path_index: u32 which causes
truncation for address trees deeper than u32::MAX; change the parameter to a
wider integer (prefer usize or u64) in the signature and all internal uses
(e.g., ChangelogEntry::default_with_index call and any arithmetic) and only
downcast to u32 at the true boundary if an API you call requires u32; also
update any callers that downcast from usize/u64 to pass the wider type.
Additionally ensure address-tree initialization uses ADDRESS_TREE_INIT_ROOT_40
and that the address-tree next_index starts at 1 (update
constructors/initializers that set next_index) so address trees for height 40
behave correctly.
prover/client/src/proof_types/batch_append/proof_inputs.rs (1)

180-194: ⚠️ Potential issue | 🟠 Major

Remove the remaining proof-length panic in this path.

The new ? only covers compute_root_from_merkle_proof; merkle_proof.try_into().unwrap() still aborts the process if the indexer returns a proof with the wrong height. This function already returns ProverClientError, so surface the mismatch instead of panicking.

Suggested fix
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_append/proof_inputs.rs` around lines 180
- 194, The code still panics on merkle_proof.try_into().unwrap(); replace the
unwrap with fallible conversion that returns a ProverClientError instead of
aborting: call merkle_proof.try_into().map_err(|e|
ProverClientError::InvalidProofLength { source: e.into(), context:
format!("expected height for index {}", start_index + i as u32) })? (or perform
a match and return an appropriate ProverClientError), then continue using
merkle_proof_array with compute_root_from_merkle_proof(final_leaf,
&merkle_proof_array, start_index + i as u32)? so the function surfaces
proof-length mismatches via its existing ProverClientError return type rather
than panicking on unwrap.
prover/client/src/proof_types/batch_update/proof_inputs.rs (1)

171-178: ⚠️ Potential issue | 🟠 Major

Finish the fallible-path cleanup in get_batch_update_inputs.

This call site now propagates compute_root_from_merkle_proof, but the proof conversion and Poseidon::hashv immediately above still unwrap(). A malformed proof or invalid hash input will still panic the process instead of returning ProverClientError.

Suggested fix
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
@@
-        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]]).unwrap();
+        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]])?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs` around lines 171
- 178, The code in get_batch_update_inputs currently uses
merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can
panic; change both to propagate errors as ProverClientError instead: replace
merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e|
ProverClientError::from(...))? (or an appropriate ProverClientError variant) and
replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e|
ProverClientError::from(...))? so that malformed proofs or hash failures return
an Err from get_batch_update_inputs and remain consistent with the
compute_root_from_merkle_proof error propagation.
prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)

369-489: ⚠️ Potential issue | 🟠 Major

Keep the staging mutations transactional on Err.

This function now returns recoverable errors after it has already pushed into changelog / indexed_changelog and mutated sparse_merkle_tree (for example the later compute_root_from_merkle_proof(...) ? calls and the final ordering check). If any of those fail, the caller gets Err but the local staging state is left partially advanced, so the next retry or batch runs against corrupted state. Build into temporary state and commit only once the whole batch succeeds, or add explicit rollback.

Also applies to: 500-518

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around
lines 369 - 489, The code mutates staging state (changelog, indexed_changelog,
sparse_merkle_tree, and pushes via patcher.push_changelog_entry::<HEIGHT>,
sparse_merkle_tree.append, and indexed_changelog.push) before fallible
operations complete, so on Err the local state is partially advanced; fix by
performing all fallible computations on temporary/clone state (clone
sparse_merkle_tree, collect changelog entries and merkle proof arrays into local
Vecs like new_element_circuit_merkle_proofs_tmp / indexed_changelog_tmp) and
only call patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append (or
replace the real tree), and indexed_changelog.push once the entire batch loop
finishes successfully; alternatively implement an explicit rollback path that
reverts mutations (undo append and pop changelog) on any early Err — apply same
change for the other block around lines 500-518 where similar pushes occur.
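The clone-then-commit pattern suggested above can be sketched like this; StagingState and the batch items are simplified stand-ins for the real changelog and sparse-merkle-tree state:

```rust
// Illustrative commit-on-success pattern; the struct and item types are
// placeholders, not the forester's real staging state.
#[derive(Clone, Default, Debug, PartialEq)]
struct StagingState {
    changelog: Vec<u64>,
}

fn process_batch(state: &mut StagingState, items: &[u64]) -> Result<(), String> {
    // Mutate a clone, so an Err mid-batch leaves `state` untouched.
    let mut staged = state.clone();
    for &item in items {
        if item == 0 {
            return Err("invalid item".into()); // original state not advanced
        }
        staged.changelog.push(item);
    }
    *state = staged; // commit only after the whole batch succeeds
    Ok(())
}

fn main() {
    let mut state = StagingState::default();
    assert!(process_batch(&mut state, &[1, 0, 3]).is_err());
    assert!(state.changelog.is_empty()); // no partial mutation on Err
    assert!(process_batch(&mut state, &[1, 2]).is_ok());
    assert_eq!(state.changelog, vec![1, 2]);
}
```

Cloning is the simplest way to get this transactionality; an explicit rollback path avoids the copy but must undo every mutation in reverse order, which is easier to get wrong.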
sdk-libs/program-test/src/indexer/test_indexer.rs (1)

2500-2514: ⚠️ Potential issue | 🔴 Critical

Non-2xx prover responses hot-loop forever here.

When send() succeeds but the prover returns 4xx/5xx, retries is never decremented and the loop immediately issues another request. That turns a transient server error into an unbounded tight retry, and the adjacent unwrap() calls on text() / deserialize_gnark_proof_json() still panic on malformed responses instead of surfacing IndexerError.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2500 - 2514,
The loop currently treats any Ok(response) the same and only decrements retries
on send() errors, causing tight infinite retries for 4xx/5xx and panics from
unwrap() on malformed bodies; update the block that handles Ok(response_result)
so that non-success HTTP statuses decrement retries (and optionally sleep)
instead of immediately retrying, and convert the body/text and JSON parsing
unwraps into fallible operations that map errors into IndexerError (e.g. replace
response_result.text().await.unwrap() and
deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to
IndexerError), and ensure that any failure in proof_from_json_struct or
compress_proof is returned as Err(IndexerError::CustomError(...)) rather than
panicking; reference the response handling around deserialize_gnark_proof_json,
proof_from_json_struct, compress_proof and IndexerError to locate the changes.
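The bounded-retry behavior both comments ask for can be sketched std-only; the `send` closure stands in for the reqwest call, and the names are illustrative:

```rust
// Sketch of a retry budget that is spent on *both* transport errors and
// non-2xx statuses, so a failing prover cannot hot-loop forever.
// `send` is a placeholder for the real HTTP call.
fn retry_with_budget<F>(mut send: F, mut retries: u32) -> Result<String, String>
where
    F: FnMut() -> Result<(u16, String), String>,
{
    loop {
        match send() {
            Ok((status, body)) if (200u16..300).contains(&status) => return Ok(body),
            Ok((status, body)) => {
                // Non-2xx: consume a retry instead of spinning.
                retries -= 1;
                if retries == 0 {
                    return Err(format!("prover returned {}: {}", status, body));
                }
            }
            Err(err) => {
                retries -= 1;
                if retries == 0 {
                    return Err(err);
                }
            }
        }
        // A real implementation would sleep here before the next attempt.
    }
}

fn main() {
    let always_500 = || Ok((500u16, "boom".to_string()));
    assert!(retry_with_budget(always_500, 3).is_err());
    let ok = || Ok((200u16, "proof".to_string()));
    assert_eq!(retry_with_budget(ok, 3).unwrap(), "proof");
}
```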
forester/src/processor/v2/strategy/address.rs (1)

269-308: ⚠️ Potential issue | 🟠 Major

Require a full ZKP batch before calling process_batch.

StreamingAddressQueue::wait_for_batch() can return early after its timeout, so get_batch_snapshot() may hand back fewer than zkp_batch_size_usize addresses. Feeding addresses.len() downstream as zkp_batch_size_actual turns that short snapshot into a real address-append job instead of treating it as "not ready yet", which means a slow indexer can produce inputs for an unsupported circuit size.

Suggested guard
-        let zkp_batch_size_actual = addresses.len();
+        if addresses.len() != zkp_batch_size_usize {
+            return Ok(None);
+        }
         let result = staging_tree
             .process_batch(
                 &addresses,
                 &low_element_values,
                 &low_element_next_values,
                 &low_element_indices,
                 &low_element_next_indices,
                 &low_element_proofs,
                 leaves_hashchain,
-                zkp_batch_size_actual,
+                zkp_batch_size_usize,
                 epoch,
                 tree,
             )

Based on learnings: ZKP batch sizes for address trees must be 10 or 250 in src/initialize_address_tree.rs. Only these circuit sizes are supported for address trees.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@forester/src/processor/v2/strategy/address.rs` around lines 269 - 308, The
snapshot returned by StreamingAddressQueue::get_batch_snapshot may be shorter
than the ZKP batch size; before calling staging_tree.process_batch, verify that
addresses.len() equals zkp_batch_size_usize (the expected batch size) and, if it
is smaller, treat the batch as "not ready" (do not call process_batch) — e.g.,
return/propagate a no-op/NotReady result or retry later. Update the logic around
AddressBatchSnapshot retrieval (get_batch_snapshot::<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual
against zkp_batch_size_usize and only pass a full batch into
staging_tree.process_batch; ensure the code path preserves the existing error
mapping via map_address_staging_error when a full batch is processed.
♻️ Duplicate comments (1)
sdk-libs/client/src/indexer/types/queue.rs (1)

70-73: ⚠️ Potential issue | 🟠 Major

Reject non-40 proof heights before reconstruction.

These methods now accept any const HEIGHT, but address-queue proofs are only valid at height 40. HEIGHT < 40 silently truncates the witness, HEIGHT > 40 degrades into a late MissingResult, and encode_node_index() only reserves 8 bits for level, so oversized values can alias node ids. Add one shared height guard before any proof reconstruction path and fail early with a clear IndexerError.

Based on learnings: "Tree height must be fixed at 40 for address trees in src/initialize_address_tree.rs. Validate that height equals 40."

Also applies to: 125-143, 155-205, 209-210

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/indexer/types/queue.rs` around lines 70 - 73, Reject any
const HEIGHT != 40 up-front in all proof-reconstruction entry points (e.g.,
reconstruct_proof and the other methods referenced) by adding a single shared
guard that checks HEIGHT == 40 and returns a clear IndexerError when it is not;
update any call sites that route into encode_node_index or witness/truncation
logic so they never run for non-40 heights, and ensure the error message/variant
clearly states expected=40 and the provided height to make debugging explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 146-152: build_node_lookup currently maps node ids to the last
index because collect() overwrites duplicates, which breaks first-match
semantics used by reconstruct_proof (which uses position()); update
build_node_lookup to either detect and reject duplicate node ids or preserve the
first index by using HashMap::entry().or_insert(...) so the first occurrence
remains in the map (refer to function build_node_lookup and callers like
reconstruct_proof to ensure consistent behavior).
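The first-occurrence-wins behavior can be sketched with `HashMap::entry`; the node-id type and indices here are placeholders:

```rust
use std::collections::HashMap;

// Sketch of a lookup that preserves the *first* index for duplicate ids.
// collect() from an iterator of pairs would keep the last one instead,
// diverging from position()-based first-match semantics.
fn build_node_lookup(node_ids: &[u64]) -> HashMap<u64, usize> {
    let mut lookup = HashMap::new();
    for (i, &id) in node_ids.iter().enumerate() {
        // entry().or_insert() is a no-op if the id is already present.
        lookup.entry(id).or_insert(i);
    }
    lookup
}

fn main() {
    let lookup = build_node_lookup(&[5, 9, 5]);
    assert_eq!(lookup[&5], 0); // first occurrence preserved, not index 2
    assert_eq!(lookup[&9], 1);
}
```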

In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 213-216: The debug-only check for max_per_group (> 0) is
inconsistent with the runtime branch that returns an empty Vec when
max_per_group == 0; make the behavior consistent by failing in all builds:
replace the debug_assert!(max_per_group > 0, ...) with a plain
assert!(max_per_group > 0, ...) (or otherwise return an Err if this function is
fallible) and remove the special-case path that returns Vec::new() when
max_per_group == 0 so that invalid zero input always triggers the same failure;
ensure references to max_per_group and specs remain and remove the silent
Vec::new() return for the zero case.

---

Outside diff comments:
In `@forester/src/processor/v2/strategy/address.rs`:
- Around line 269-308: The snapshot returned by
StreamingAddressQueue::get_batch_snapshot may be shorter than the ZKP batch
size; before calling staging_tree.process_batch, verify that addresses.len()
equals zkp_batch_size_usize (the expected batch size) and, if it is smaller,
treat the batch as "not ready" (do not call process_batch) — e.g.,
return/propagate a no-op/NotReady result or retry later. Update the logic around
AddressBatchSnapshot retrieval (get_batch_snapshot::<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual
against zkp_batch_size_usize and only pass a full batch into
staging_tree.process_batch; ensure the code path preserves the existing error
mapping via map_address_staging_error when a full batch is processed.

In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently
accepts path_index: u32 which causes truncation for address trees deeper than
u32::MAX; change the parameter to a wider integer (prefer usize or u64) in the
signature and all internal uses (e.g., ChangelogEntry::default_with_index call
and any arithmetic) and only downcast to u32 at the true boundary if an API you
call requires u32; also update any callers that downcast from usize/u64 to pass
the wider type. Additionally ensure address-tree initialization uses
ADDRESS_TREE_INIT_ROOT_40 and that the address-tree next_index starts at 1
(update constructors/initializers that set next_index) so address trees for
height 40 behave correctly.
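The "widen internally, downcast only at the boundary" pattern from this prompt can be shown in isolation; both function names below are illustrative:

```rust
/// Internal arithmetic stays at usize width, so indices for deep trees
/// are never silently truncated with an `as u32` cast.
fn changelog_index(path_index: usize) -> usize {
    path_index
}

/// Downcast only where an external API genuinely requires u32,
/// and surface overflow as an error instead of wrapping.
fn to_u32_boundary(path_index: usize) -> Result<u32, String> {
    u32::try_from(path_index)
        .map_err(|_| format!("path index {} exceeds the u32 API boundary", path_index))
}
```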

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 369-489: The code mutates staging state (changelog,
indexed_changelog, sparse_merkle_tree, and pushes via
patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append, and
indexed_changelog.push) before fallible operations complete, so on Err the local
state is partially advanced; fix by performing all fallible computations on
temporary/clone state (clone sparse_merkle_tree, collect changelog entries and
merkle proof arrays into local Vecs like new_element_circuit_merkle_proofs_tmp /
indexed_changelog_tmp) and only call patcher.push_changelog_entry::<HEIGHT>,
sparse_merkle_tree.append (or replace the real tree), and indexed_changelog.push
once the entire batch loop finishes successfully; alternatively implement an
explicit rollback path that reverts mutations (undo append and pop changelog) on
any early Err — apply same change for the other block around lines 500-518 where
similar pushes occur.
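The clone-validate-commit pattern this prompt recommends, reduced to stand-in types (the real staging state holds changelogs and a sparse Merkle tree, not two `Vec<u32>`s):

```rust
#[derive(Clone, Debug, PartialEq)]
struct StagingState {
    changelog: Vec<u32>,
    leaves: Vec<u32>,
}

/// Apply a whole batch atomically: mutate a clone, and only replace the
/// real state once every element has succeeded. On any Err the original
/// state is untouched, so no partial advancement can leak out.
fn apply_batch(state: &mut StagingState, batch: &[u32]) -> Result<(), String> {
    let mut scratch = state.clone();
    for &item in batch {
        if item == 0 {
            // Fallible step failed mid-batch; `state` was never mutated.
            return Err("invalid element; state left unchanged".into());
        }
        scratch.changelog.push(item);
        scratch.leaves.push(item * 2);
    }
    // Commit point: the entire batch succeeded.
    *state = scratch;
    Ok(())
}
```

The trade-off versus an explicit rollback path is one extra clone per batch, which is usually cheaper to maintain than undo logic for appends and changelog pushes.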

In `@prover/client/src/proof_types/batch_append/proof_inputs.rs`:
- Around line 180-194: The code still panics on
merkle_proof.try_into().unwrap(); replace the unwrap with fallible conversion
that returns a ProverClientError instead of aborting: call
merkle_proof.try_into().map_err(|e| ProverClientError::InvalidProofLength {
source: e.into(), context: format!("expected height for index {}", start_index +
i as u32) })? (or perform a match and return an appropriate ProverClientError),
then continue using merkle_proof_array with
compute_root_from_merkle_proof(final_leaf, &merkle_proof_array, start_index + i
as u32)? so the function surfaces proof-length mismatches via its existing
ProverClientError return type rather than panicking on unwrap.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs`:
- Around line 171-178: The code in get_batch_update_inputs currently uses
merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can
panic; change both to propagate errors as ProverClientError instead: replace
merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e|
ProverClientError::from(...))? (or an appropriate ProverClientError variant) and
replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e|
ProverClientError::from(...))? so that malformed proofs or hash failures return
an Err from get_batch_update_inputs and remain consistent with the
compute_root_from_merkle_proof error propagation.
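Both of these prompts boil down to replacing `try_into().unwrap()` with a mapped error. A compact sketch with a reduced `HEIGHT` and a simplified error enum (the real code uses height 40 and the crate's `ProverClientError`):

```rust
const HEIGHT: usize = 4; // stand-in; real address trees use height 40

#[derive(Debug)]
enum ProverClientError {
    GenericError(String),
}

/// Convert a proof Vec into a fixed-size array, surfacing a length
/// mismatch as an Err instead of panicking on unwrap.
fn to_proof_array(merkle_proof: Vec<[u8; 32]>) -> Result<[[u8; 32]; HEIGHT], ProverClientError> {
    let got = merkle_proof.len();
    merkle_proof.try_into().map_err(|_| {
        ProverClientError::GenericError(format!(
            "invalid merkle proof length: got {}, expected {}",
            got, HEIGHT
        ))
    })
}
```

Note the length is captured before `try_into`, since `Vec::try_into` consumes the vector (its error value returns the original `Vec`, but capturing first keeps the closure simple).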

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2500-2514: The loop currently treats any Ok(response) the same and
only decrements retries on send() errors, causing tight infinite retries for
4xx/5xx and panics from unwrap() on malformed bodies; update the block that
handles Ok(response_result) so that non-success HTTP statuses decrement retries
(and optionally sleep) instead of immediately retrying, and convert the
body/text and JSON parsing unwraps into fallible operations that map errors into
IndexerError (e.g. replace response_result.text().await.unwrap() and
deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to
IndexerError), and ensure that any failure in proof_from_json_struct or
compress_proof is returned as Err(IndexerError::CustomError(...)) rather than
panicking; reference the response handling around deserialize_gnark_proof_json,
proof_from_json_struct, compress_proof and IndexerError to locate the changes.

---

Duplicate comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: Reject any const HEIGHT != 40 up-front in all
proof-reconstruction entry points (e.g., reconstruct_proof and the other methods
referenced) by adding a single shared guard that checks HEIGHT == 40 and returns
a clear IndexerError when it is not; update any call sites that route into
encode_node_index or witness/truncation logic so they never run for non-40
heights, and ensure the error message/variant clearly states expected=40 and the
provided height to make debugging explicit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f30f9f95-bb7c-4dcf-b433-899ee6f79ada

📥 Commits

Reviewing files that changed from the base of the PR and between 67a29a4 and 238cb7f.

⛔ Files ignored due to path filters (5)
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (12)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/errors.rs
  • prover/client/src/helpers.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch 2 times, most recently from bfa6672 to e70763e Compare March 14, 2026 19:12
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from e70763e to 10ea034 Compare March 14, 2026 19:21
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from 19530a6 to a39fe5d Compare March 14, 2026 21:34
@sergeytimoshin sergeytimoshin force-pushed the sergey/load-accounts-batching branch from a39fe5d to ed44b77 Compare March 15, 2026 00:36

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
prover/client/src/helpers.rs (1)

49-54: ⚠️ Potential issue | 🟠 Major

Make path_index wide enough for address-tree indices.

The helper is now HEIGHT-generic, but path_index is still u32. The new address-append callers already hold indices as usize and have to downcast them with as u32; once an address tree passes u32::MAX, root reconstruction will wrap and produce the wrong path/changelog long before the tree is full.

Proposed fix
 pub fn compute_root_from_merkle_proof<const HEIGHT: usize>(
     leaf: [u8; 32],
     path_elements: &[[u8; 32]; HEIGHT],
-    path_index: u32,
+    path_index: usize,
 ) -> Result<([u8; 32], ChangelogEntry<HEIGHT>), ProverClientError> {
-    let mut changelog_entry = ChangelogEntry::default_with_index(path_index as usize);
+    let mut changelog_entry = ChangelogEntry::default_with_index(path_index);
 
     let mut current_hash = leaf;
     let mut current_index = path_index;

Then update the address-append call sites to pass low_element.index, new_low_element.index, and current_index directly, without lossy casts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/helpers.rs` around lines 49 - 54, The function
compute_root_from_merkle_proof currently takes path_index: u32 which can
overflow for large address trees; change its signature to accept path_index:
usize and update the internal use of
ChangelogEntry::default_with_index(path_index as usize) to pass the usize
directly (ChangelogEntry::default_with_index(path_index)), then update all
callers (the address-append sites that currently cast indices with as u32) to
pass their usize indices directly (e.g., low_element.index,
new_low_element.index, current_index) so there is no lossy cast or potential
wrap-around.
prover/client/tests/batch_address_append.rs (1)

87-109: 🧹 Nitpick | 🔵 Trivial

Add a non-empty changelog regression here.

This test still starts with empty changelog and indexed_changelog, so it only exercises the first-batch branch. The new proof-patching logic is only used once those vectors already contain prior entries, which is the riskier path introduced in this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/tests/batch_address_append.rs` around lines 87 - 109, The test
currently initializes changelog and indexed_changelog as empty vectors so
get_batch_address_append_circuit_inputs only covers the first-batch path;
pre-populate both changelog: Vec<ChangelogEntry<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and indexed_changelog:
Vec<IndexedChangelogEntry<usize, { DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize
}>> with one or more realistic entries (matching the ChangelogEntry and
IndexedChangelogEntry shapes used elsewhere in tests) before calling
get_batch_address_append_circuit_inputs so the function exercises the
proof-patching logic for non-empty prior entries; ensure the synthetic entries
use valid roots/indices/values consistent with the sparse_merkle_tree and
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@forester/src/processor/v2/helpers.rs`:
- Around line 489-534: In get_batch_snapshot: reject truncated snapshots early
by changing the available check to return Ok(None) unless available >= end
(i.e., if self.wait_for_batch(end) < end -> Ok(None)) so we never construct
proofs for partial batches; before slicing validate that
data.low_element_values, low_element_next_values, low_element_indices,
low_element_next_indices and data.addresses have length >= actual_end and that
data.reconstruct_proofs::<HEIGHT>(start..actual_end) can be produced (otherwise
return an error like "incomplete batch data"); keep the existing
leaves_hash_chains fallback logic but do not proceed to build
AddressBatchSnapshot when any of these sibling/vector length checks fail.
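The early-return-versus-error split described above (partial batch means "not ready", short sibling vectors mean corrupt data) can be sketched as a standalone check; names and the error type are simplified stand-ins:

```rust
/// Decide whether a batch snapshot may be built for `start..end`.
/// - Ok(None): fewer items available than requested -> batch not ready, retry later.
/// - Err(..): the range is claimed available but a backing vector is shorter
///   than `end` -> incomplete batch data, which must never become a proof.
fn validate_snapshot_range(
    available: usize,
    start: usize,
    end: usize,
    backing_vec_lens: &[usize],
) -> Result<Option<()>, String> {
    if available < end {
        return Ok(None);
    }
    if backing_vec_lens.iter().any(|&len| len < end) {
        return Err(format!("incomplete batch data for range {}..{}", start, end));
    }
    Ok(Some(()))
}
```

Distinguishing the two cases matters: "not ready" is routine under load, while a short `low_element_*` vector indicates an indexer bug that should surface loudly rather than produce a truncated proof.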

In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 202-209: Before slicing new_element_values and allocating
per-element vectors, validate that new_element_values.len() >= zkp_batch_size
and that each corresponding input slice (the low_element_* and new_element_*
sources used later) has at least zkp_batch_size elements; if any check fails
return a ProverClientError instead of panicking. Move the check to just before
the line creating the slice (currently using let new_element_values =
&new_element_values[..zkp_batch_size];), only perform the slice after
validation, and use a single safe variable (e.g., batch_len = zkp_batch_size)
when creating capacities and indexing in subsequent loops so all
Vec::with_capacity(...) and later indexing use the validated length. Ensure the
error returned is descriptive (e.g., "truncated batch from indexer") and that
all low_element_* and new_element_* sources are validated the same way to avoid
any later panics.

In `@prover/client/src/prover.rs`:
- Around line 54-57: health_check currently builds a reqwest client with
no_proxy() while ProofClient::local(), ProofClient::with_config(), and
ProofClient::with_full_config() use Client::new(), causing divergent proxy
behavior and a silent ignore of builder errors; create a shared helper (e.g.,
build_http_client() -> Result<reqwest::Client, anyhow::Error>) that constructs
the client with the desired proxy policy (no_proxy() or configurable as
decided), return and propagate the Result so callers can log errors instead of
silently returning, then refactor health_check and all ProofClient constructors
to call this helper (replace direct Client::new() and the current builder call),
and ensure any builder Err is logged via the existing logger before returning an
error.

In `@sdk-libs/client/src/indexer/photon_indexer.rs`:
- Around line 1145-1154: The code currently only checks x.proof.len() <
STATE_MERKLE_TREE_CANOPY_DEPTH which allows truncated proofs; update the
validation before building the MerkleProof (around x.proof, proof_len logic) to
require the post-canopy proof length equals the expected sibling count (i.e.
expected_siblings = STATE_MERKLE_TREE_HEIGHT - STATE_MERKLE_TREE_CANOPY_DEPTH or
the known constant 16 for height 26/canopy 10). If x.proof.len() does not equal
STATE_MERKLE_TREE_CANOPY_DEPTH + expected_siblings (or proof_len !=
expected_siblings) return Err(IndexerError::InvalidParameters(...)) instead of
silently slicing and constructing the MerkleProof.
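The exact-length check amounts to simple arithmetic over the tree constants; a sketch using the height-26/canopy-10 values cited above (the error type is simplified):

```rust
const STATE_MERKLE_TREE_HEIGHT: usize = 26;
const STATE_MERKLE_TREE_CANOPY_DEPTH: usize = 10;

/// Require the proof to contain exactly canopy + sibling entries.
/// A `>=` or canopy-only check would accept truncated proofs and
/// silently slice them into an invalid MerkleProof.
fn validate_proof_len(proof_len: usize) -> Result<usize, String> {
    let expected_siblings = STATE_MERKLE_TREE_HEIGHT - STATE_MERKLE_TREE_CANOPY_DEPTH; // 16
    let expected_total = STATE_MERKLE_TREE_CANOPY_DEPTH + expected_siblings;
    if proof_len != expected_total {
        return Err(format!(
            "invalid proof length {}, expected {}",
            proof_len, expected_total
        ));
    }
    Ok(expected_siblings)
}
```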

In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 128-137: The method reconstruct_proofs currently reserves capacity
using address_range.len() before checking that the requested range is within the
available proofs; validate the range bounds up front (before Vec::with_capacity)
by comparing address_range.start and address_range.end against the available
proof count (e.g., self.proof_count() or the relevant collection length on
self), and return an IndexerError for out-of-range requests; only then call
Vec::with_capacity and proceed to call build_node_lookup and
reconstruct_proof_with_lookup::<HEIGHT>.

In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 120-124: The failure index reported by MissingPdaCompressed is
non-unique because collect_pda_hashes(group) restarts enumerate() per batch;
change the code so the global cold-PDA offset is carried into each group (or
remove the index from MissingPdaCompressed). Concretely: modify
collect_pda_hashes to accept a start_index/offset (or create a new
collect_pda_hashes_with_offset) and when mapping pda_groups pass the accumulated
offset (e.g., outer_idx * MAX_PDAS_PER_IX or running total) so that the inner
enumerate adds that offset when constructing MissingPdaCompressed.index;
alternatively, remove the index field from MissingPdaCompressed to match
MissingPdaCompressedData for consistency.
- Around line 244-259: fetch_individual_proofs spawns all RPCs at once which can
overwhelm the indexer; replace the try_join_all approach with a bounded
concurrency stream so requests are limited but results keep original order. Use
futures::stream::iter over the hashes, map each hash to the async
indexer.get_validity_proof call, then apply StreamExt::buffered(CONCURRENCY) (or
a named constant) and collect the responses into a Vec to preserve positional
order; apply the same pattern to fetch_proof_batches (use the same CONCURRENCY
constant and the same stream->buffered->collect flow) so both functions limit
concurrent RPCs while keeping result ordering for subsequent zip operations.
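The review suggests `futures::stream` with `buffered` for the real async code. As a dependency-free illustration of the same idea — bounding in-flight work while preserving input order — here is a chunked, thread-based analogue (a sketch only; the production fix would keep the async `buffered` approach):

```rust
use std::thread;

/// Run `f` over `items` with at most `limit` workers in flight at once,
/// returning results in the original input order.
fn map_bounded<T, R>(items: Vec<T>, limit: usize, f: fn(T) -> R) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
{
    assert!(limit > 0, "concurrency limit must be positive");
    let mut out = Vec::with_capacity(items.len());
    let mut iter = items.into_iter();
    loop {
        let chunk: Vec<T> = iter.by_ref().take(limit).collect();
        if chunk.is_empty() {
            break;
        }
        // Spawn at most `limit` threads for this chunk...
        let handles: Vec<_> = chunk
            .into_iter()
            .map(|item| thread::spawn(move || f(item)))
            .collect();
        // ...and join in spawn order, so results stay positionally aligned
        // with the inputs (the property the later zip operations rely on).
        for handle in handles {
            out.push(handle.join().expect("worker panicked"));
        }
    }
    out
}
```

`StreamExt::buffered(CONCURRENCY)` gives the same two guarantees without chunk barriers: a hard cap on concurrent futures plus results yielded in submission order.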

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 729-731: The response currently sets start_index to
address_tree_bundle.right_most_index(), which breaks pagination semantics
because the returned addresses slice is still paginated by
options.address_queue_start_index; revert start_index to reflect the pagination
offset (use options.address_queue_start_index) and add a new field (e.g.,
tree_next_insertion_index or insertion_index) to the response struct to carry
address_tree_bundle.right_most_index() as u64; update the response construction
where start_index and addresses are populated and adjust the response
type/struct (and any builders) to include the new field so callers can read the
sparse-tree next insertion index without overloading start_index.

---

Outside diff comments:
In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently takes
path_index: u32 which can overflow for large address trees; change its signature
to accept path_index: usize and update the internal use of
ChangelogEntry::default_with_index(path_index as usize) to pass the usize
directly (ChangelogEntry::default_with_index(path_index)), then update all
callers (the address-append sites that currently cast indices with as u32) to
pass their usize indices directly (e.g., low_element.index,
new_low_element.index, current_index) so there is no lossy cast or potential
wrap-around.

In `@prover/client/tests/batch_address_append.rs`:
- Around line 87-109: The test currently initializes changelog and
indexed_changelog as empty vectors so get_batch_address_append_circuit_inputs
only covers the first-batch path; pre-populate both changelog:
Vec<ChangelogEntry<{ DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and
indexed_changelog: Vec<IndexedChangelogEntry<usize, {
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> with one or more realistic
entries (matching the ChangelogEntry and IndexedChangelogEntry shapes used
elsewhere in tests) before calling get_batch_address_append_circuit_inputs so
the function exercises the proof-patching logic for non-empty prior entries;
ensure the synthetic entries use valid roots/indices/values consistent with the
sparse_merkle_tree and DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 276f6819-fcb6-48df-a7d4-de182c3af1f6

📥 Commits

Reviewing files that changed from the base of the PR and between 238cb7f and ed44b77.

⛔ Files ignored due to path filters (7)
  • Cargo.lock is excluded by !**/*.lock and included by none
  • forester-utils/src/address_staging_tree.rs is excluded by none and included by none
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/mock_batched_forester.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
  • sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
  • sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (14)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/constants.rs
  • prover/client/src/errors.rs
  • prover/client/src/helpers.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • prover/client/src/prover.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
prover/client/src/proof_types/batch_update/proof_inputs.rs (1)

139-178: ⚠️ Potential issue | 🟠 Major

Fail fast on mismatched batch-update inputs.

This loop only zips leaves, merkle_proofs, and path_indices, but still indexes tx_hashes[i]. A short tx_hashes vector, a short proof, or a hash failure still panics here even after making compute_root_from_merkle_proof() fallible.

Possible fix
+    let expected_len = batch_size as usize;
+    if tx_hashes.len() != expected_len
+        || leaves.len() != expected_len
+        || old_leaves.len() != expected_len
+        || merkle_proofs.len() != expected_len
+        || path_indices.len() != expected_len
+    {
+        return Err(ProverClientError::GenericError(format!(
+            "Input vector length mismatch: tx_hashes={}, leaves={}, old_leaves={}, merkle_proofs={}, path_indices={}, expected={}",
+            tx_hashes.len(),
+            leaves.len(),
+            old_leaves.len(),
+            merkle_proofs.len(),
+            path_indices.len(),
+            expected_len,
+        )));
+    }
+
-    for (i, (leaf, (mut merkle_proof, index))) in leaves
+    for (i, ((leaf, tx_hash), (mut merkle_proof, index))) in leaves
         .iter()
-        .zip(merkle_proofs.into_iter().zip(path_indices.iter()))
+        .zip(tx_hashes.iter())
+        .zip(merkle_proofs.into_iter().zip(path_indices.iter()))
         .enumerate()
     {
@@
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
@@
-        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]]).unwrap();
+        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, tx_hash])?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs` around lines 139
- 178, The loop can panic because tx_hashes is indexed by i while the zip only
covers leaves, merkle_proofs and path_indices; validate input lengths up front
or include tx_hashes in the zip to avoid out-of-bounds access and remove
unwraps: check that leaves.len() == merkle_proofs.len() == path_indices.len() ==
tx_hashes.len() and return a ProverClientError if not, replace
merkle_proof.try_into().unwrap() with a fallible conversion that maps the error
into ProverClientError, and ensure Poseidon::hashv and
compute_root_from_merkle_proof errors are propagated rather than causing panics.
prover/client/src/proof_types/batch_append/proof_inputs.rs (1)

135-194: ⚠️ Potential issue | 🟠 Major

Validate the batch-append inputs before building the circuit.

This loop truncates to the shortest of old_leaves, leaves, and merkle_proofs, so a mismatch can silently produce fewer append proofs than batch_size claims. merkle_proof.try_into().unwrap() also still turns a bad indexer response into a panic instead of a ProverClientError.

Possible fix
+    let expected_len = batch_size as usize;
+    if old_leaves.len() != expected_len
+        || leaves.len() != expected_len
+        || merkle_proofs.len() != expected_len
+    {
+        return Err(ProverClientError::GenericError(format!(
+            "Input vector length mismatch: old_leaves={}, leaves={}, merkle_proofs={}, expected={}",
+            old_leaves.len(),
+            leaves.len(),
+            merkle_proofs.len(),
+            expected_len,
+        )));
+    }
+
     for (i, (old_leaf, (new_leaf, mut merkle_proof))) in old_leaves
         .iter()
         .zip(leaves.iter().zip(merkle_proofs.into_iter()))
         .enumerate()
     {
@@
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/proof_types/batch_append/proof_inputs.rs` around lines 135
- 194, The loop silently truncates to the shortest iterator (old_leaves, leaves,
merkle_proofs) and panics on invalid proof conversion; validate input lengths
and convert safely before building the circuit: check that old_leaves.len(),
leaves.len(), and merkle_proofs.len() equal batch_size (and/or start_index +
batch_size bounds) and return a ProverClientError on mismatch; replace
merkle_proof.try_into().unwrap() with a fallible conversion that maps the error
into ProverClientError (maintain context like current_index and batch_position);
keep remaining logic (is_old_leaf_zero, compute_root_from_merkle_proof)
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@forester/src/processor/v2/helpers.rs`:
- Around line 489-494: get_batch_snapshot currently accepts independent
caller-controlled cursors (start..end and hashchain_idx); enforce batch
alignment by computing the batch index from start using self.zkp_batch_size
(e.g., batch_idx = start / self.zkp_batch_size), require end == start +
self.zkp_batch_size, and derive the hash-chain index from that batch index
instead of trusting the caller-provided hashchain_idx (or validate that the
passed hashchain_idx equals the derived value and return an error if not).
Update get_batch_snapshot to validate these invariants and use the derived
hash-chain index when loading the hash chain; apply the same change to the
analogous helper at the other location referenced in the comment.
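Deriving the hash-chain cursor from the batch range, as the prompt suggests, removes one caller-controlled input entirely; a simplified sketch:

```rust
/// Validate that `start..end` is exactly one aligned ZKP batch and
/// derive the hash-chain index from it, instead of trusting a second
/// caller-supplied cursor that could point at a different batch.
fn checked_batch_cursor(
    start: usize,
    end: usize,
    zkp_batch_size: usize,
) -> Result<usize, String> {
    if zkp_batch_size == 0 || start % zkp_batch_size != 0 || end != start + zkp_batch_size {
        return Err(format!(
            "invalid batch range {}..{} for zkp_batch_size {}",
            start, end, zkp_batch_size
        ));
    }
    // One batch index serves both the address slice and the hash chain,
    // so batch N's addresses can never pair with batch M's chain.
    Ok(start / zkp_batch_size)
}
```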

In `@prover/client/src/prover.rs`:
- Around line 29-31: The health_check_once function incorrectly returns true as
soon as prover_listener_present() is true; instead, remove the early return and
treat the listener as only a hint—always perform the actual HTTP readiness check
against the prover endpoint (the same logic used in the later HTTP probe block)
before returning true. Update health_check_once to, when
prover_listener_present() is true, attempt the HTTP GET health probe (or
call/reuse the existing HTTP check routine), only returning true on a successful
HTTP response; ensure spawn_prover() still starts the process if the HTTP probe
fails despite a listening port.
- Around line 136-188: The IS_LOADING flag can remain true if the awaited
startup block panics or calls expect!, so ensure IS_LOADING.store(false,
Ordering::Release) runs on every exit path of the startup sequence: wrap the
async startup block currently assigned to spawn_result in a panic::catch_unwind
(or use a scope guard/defer) so you can clear IS_LOADING in a finally-like
section after awaiting the block, then re-propagate any panic or return the
block's result; reference the IS_LOADING static, the spawn_result async block,
and the code paths that call panic! / expect! (Command::spawn, health_check
loop, child.try_wait) to place the cleanup.
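A `Drop`-based scope guard is the usual way to make the `IS_LOADING` reset unwind-safe without wrapping the block in `catch_unwind`; a sketch with a simplified startup body (the `Err` below stands in for any failing or panicking startup step):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static IS_LOADING: AtomicBool = AtomicBool::new(false);

struct LoadingGuard;

impl Drop for LoadingGuard {
    fn drop(&mut self) {
        // Drop runs on normal exit, early return, AND unwind from a panic,
        // so the flag can never be left stuck at true.
        IS_LOADING.store(false, Ordering::Release);
    }
}

fn startup() -> Result<(), String> {
    IS_LOADING.store(true, Ordering::Release);
    let _guard = LoadingGuard;
    // ... Command::spawn, health-check loop, child.try_wait would go here;
    // any `?`, panic!, or expect! still triggers the guard's Drop.
    Err("spawn failed".into())
}
```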

In `@sdk-libs/client/src/indexer/photon_indexer.rs`:
- Around line 1693-1695: The assignment is wrong: do not set
tree_next_insertion_index to aq.start_index; use the queue's stored tree
insertion position instead so the staging tree aligns with the real next leaf.
Update the struct initialization where start_index: aq.start_index,
tree_next_insertion_index: aq.start_index, root_seq: aq.root_seq to set
tree_next_insertion_index from the queue's actual next-insertion field (e.g.,
aq.tree_next_insertion_index or the correct field that records the tree's next
leaf) so AddressStagingTree::new(..., start_index) and downstream alignment
checks use the true tree insertion index rather than the slice start.

---

Outside diff comments:
In `@prover/client/src/proof_types/batch_append/proof_inputs.rs`:
- Around line 135-194: The loop silently truncates to the shortest iterator
(old_leaves, leaves, merkle_proofs) and panics on invalid proof conversion;
validate input lengths and convert safely before building the circuit: check
that old_leaves.len(), leaves.len(), and merkle_proofs.len() equal batch_size
(and/or start_index + batch_size bounds) and return a ProverClientError on
mismatch; replace merkle_proof.try_into().unwrap() with a fallible conversion
that maps the error into ProverClientError (maintain context like current_index
and batch_position); keep remaining logic (is_old_leaf_zero,
compute_root_from_merkle_proof) unchanged.

In `@prover/client/src/proof_types/batch_update/proof_inputs.rs`:
- Around line 139-178: The loop can panic because tx_hashes is indexed by i
while the zip only covers leaves, merkle_proofs and path_indices; validate input
lengths up front or include tx_hashes in the zip to avoid out-of-bounds access
and remove unwraps: check that leaves.len() == merkle_proofs.len() ==
path_indices.len() == tx_hashes.len() and return a ProverClientError if not,
replace merkle_proof.try_into().unwrap() with a fallible conversion that maps
the error into ProverClientError, and ensure Poseidon::hashv and
compute_root_from_merkle_proof errors are propagated rather than causing panics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2250d3ec-f597-48a1-97ac-13a6819486f3

📥 Commits

Reviewing files that changed from the base of the PR and between ed44b77 and f9a788c.

⛔ Files ignored due to path filters (3)
  • program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
  • program-tests/utils/src/mock_batched_forester.rs is excluded by none and included by none
  • program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
📒 Files selected for processing (18)
  • forester/src/processor/v2/helpers.rs
  • forester/src/processor/v2/processor.rs
  • forester/src/processor/v2/proof_worker.rs
  • forester/src/processor/v2/strategy/address.rs
  • prover/client/src/helpers.rs
  • prover/client/src/proof_client.rs
  • prover/client/src/proof_types/batch_address_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_append/proof_inputs.rs
  • prover/client/src/proof_types/batch_update/proof_inputs.rs
  • prover/client/src/prover.rs
  • prover/client/tests/batch_address_append.rs
  • sdk-libs/client/src/indexer/photon_indexer.rs
  • sdk-libs/client/src/indexer/types/queue.rs
  • sdk-libs/client/src/interface/load_accounts.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs
  • sdk-libs/sdk-types/src/interface/program/decompression/pda.rs
  • sdk-libs/sdk-types/src/interface/program/decompression/processor.rs
  • sdk-libs/sdk-types/src/interface/program/decompression/token.rs

Comment on lines +489 to +494
pub fn get_batch_snapshot<const HEIGHT: usize>(
    &self,
    start: usize,
    end: usize,
    hashchain_idx: usize,
) -> crate::Result<Option<AddressBatchSnapshot<HEIGHT>>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bind the hash-chain cursor to the requested batch.

get_batch_snapshot still trusts two independent caller-controlled cursors: start..end and hashchain_idx. That lets a caller build a non-batch-sized snapshot or accidentally pair batch N's addresses/proofs with batch M's hash chain, which only fails later during proof generation/verification. This helper already knows self.zkp_batch_size, so it should enforce one aligned batch here and derive or validate the hash-chain index locally.

🩹 Suggested guardrails
     pub fn get_batch_snapshot<const HEIGHT: usize>(
         &self,
         start: usize,
         end: usize,
         hashchain_idx: usize,
     ) -> crate::Result<Option<AddressBatchSnapshot<HEIGHT>>> {
+        if start >= end {
+            return Ok(None);
+        }
+
+        let batch_len = end - start;
+        if batch_len != self.zkp_batch_size || start % self.zkp_batch_size != 0 {
+            return Err(anyhow!(
+                "invalid address batch range {}..{} for zkp_batch_size {}",
+                start,
+                end,
+                self.zkp_batch_size
+            ));
+        }
+
+        let expected_hashchain_idx = start / self.zkp_batch_size;
+        if hashchain_idx != expected_hashchain_idx {
+            return Err(anyhow!(
+                "hashchain_idx {} does not match batch range {}..{} (expected {})",
+                hashchain_idx,
+                start,
+                end,
+                expected_hashchain_idx
+            ));
+        }
+
         let available = self.wait_for_batch(end);
-        if available < end || start >= end {
+        if available < end {
             return Ok(None);
         }

Based on learnings: ZKP batch sizes for address trees must be 10 or 250, and tree updates must require zero-knowledge proofs with public inputs: old root, new root, hash chain, and next_index.

Also applies to: 524-541
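The invariant in the guardrails above reduces to integer arithmetic. As a standalone sketch (with a plain `String` error in place of the crate's error type), the hash-chain index can be derived from the batch range rather than trusted from the caller:

```rust
// Derive the hash-chain index from the batch range; reject ranges that are
// not exactly one start-aligned zkp batch. The String error is a placeholder.
fn derive_hashchain_idx(
    start: usize,
    end: usize,
    zkp_batch_size: usize,
) -> Result<usize, String> {
    if end != start + zkp_batch_size {
        return Err(format!(
            "range {start}..{end} is not one batch of size {zkp_batch_size}"
        ));
    }
    if start % zkp_batch_size != 0 {
        return Err(format!("start {start} is not aligned to {zkp_batch_size}"));
    }
    // Batch N occupies N * zkp_batch_size .. (N + 1) * zkp_batch_size.
    Ok(start / zkp_batch_size)
}
```

A caller-supplied `hashchain_idx` can then be checked against the derived value instead of being used directly.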

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@forester/src/processor/v2/helpers.rs` around lines 489 - 494,
get_batch_snapshot currently accepts independent caller-controlled cursors
(start..end and hashchain_idx); enforce batch alignment by computing the batch
index from start using self.zkp_batch_size (e.g., batch_idx = start /
self.zkp_batch_size), require end == start + self.zkp_batch_size, and derive the
hash-chain index from that batch index instead of trusting the caller-provided
hashchain_idx (or validate that the passed hashchain_idx equals the derived
value and return an error if not). Update get_batch_snapshot to validate these
invariants and use the derived hash-chain index when loading the hash chain;
apply the same change to the analogous helper at the other location referenced
in the comment.

Comment on lines +136 to +188
        if IS_LOADING
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_err()
        {
            if health_check(120, 1).await {
                return;
            }
            panic!("Failed to start prover, health check failed.");
        }

        let spawn_result = async {
            let mut command = Command::new(prover_path);
            command.arg("start-prover").stdout(Stdio::piped()).stderr(Stdio::piped());
            let mut child = command.spawn().expect("Failed to start prover process");
            let mut child_exit_status = None;

            for _ in 0..120 {
                if health_check(1, 1).await {
                    info!("Prover started successfully");
                    return;
                }

                if child_exit_status.is_none() {
                    match child.try_wait() {
                        Ok(Some(status)) => {
                            tracing::warn!(
                                ?status,
                                "prover launcher exited before health check succeeded; continuing to poll for detached prover"
                            );
                            child_exit_status = Some(status);
                        }
                        Ok(None) => {}
                        Err(error) => {
                            tracing::error!(?error, "failed to poll prover child process");
                        }
                    }
                }

                sleep(Duration::from_secs(1)).await;
            }

            if let Some(status) = child_exit_status {
                panic!(
                    "Failed to start prover, health check failed after launcher exited with status {status}."
                );
            }

            panic!("Failed to start prover, health check failed.");
        }
        .await;

        IS_LOADING.store(false, Ordering::Release);
        spawn_result
Copy link
Contributor


⚠️ Potential issue | 🟠 Major

Clear IS_LOADING on every exit path.

The flag is reset only after the awaited startup block completes normally. Any panic! or expect() inside that block unwinds past Line 187 and leaves IS_LOADING stuck true, so later callers can no longer retry startup in the same process.

Possible fix
+        struct LoadingGuard;
+        impl Drop for LoadingGuard {
+            fn drop(&mut self) {
+                IS_LOADING.store(false, Ordering::Release);
+            }
+        }
+
+        let _loading_guard = LoadingGuard;
         let spawn_result = async {
             let mut command = Command::new(prover_path);
             command.arg("start-prover").stdout(Stdio::piped()).stderr(Stdio::piped());
@@
         }
         .await;
-
-        IS_LOADING.store(false, Ordering::Release);
         spawn_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/prover.rs` around lines 136 - 188, The IS_LOADING flag can
remain true if the awaited startup block panics or calls expect!, so ensure
IS_LOADING.store(false, Ordering::Release) runs on every exit path of the
startup sequence: wrap the async startup block currently assigned to
spawn_result in a panic::catch_unwind (or use a scope guard/defer) so you can
clear IS_LOADING in a finally-like section after awaiting the block, then
re-propagate any panic or return the block's result; reference the IS_LOADING
static, the spawn_result async block, and the code paths that call panic! /
expect! (Command::spawn, health_check loop, child.try_wait) to place the
cleanup.
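As a standalone illustration of the scope-guard approach the prompt mentions (the names here are local to the sketch, not necessarily those in the prover crate): a `Drop` impl runs during unwinding, so the flag is cleared even when the guarded code panics.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

static IS_LOADING: AtomicBool = AtomicBool::new(false);

// RAII guard: Drop runs on every exit path, including panics, so the flag
// can never get stuck at true.
struct LoadingGuard;
impl Drop for LoadingGuard {
    fn drop(&mut self) {
        IS_LOADING.store(false, Ordering::Release);
    }
}

fn run_startup(body: impl FnOnce() + panic::UnwindSafe) -> bool {
    if IS_LOADING
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_err()
    {
        return false; // another caller is already starting the prover
    }
    let _guard = LoadingGuard;
    // catch_unwind is used here only so the sketch can demonstrate the panic
    // path; the guard alone is what keeps the flag consistent.
    panic::catch_unwind(body).is_ok()
}
```

After a panicking startup attempt, a subsequent caller can still win the `compare_exchange` and retry.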

Comment on lines 1693 to 1695
start_index: aq.start_index,
tree_next_insertion_index: aq.start_index,
root_seq: aq.root_seq,
Copy link
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C4 'tree_next_insertion_index|queue_next_index|start_index' forester sdk-libs program-libs --glob '*.rs'

Repository: Lightprotocol/light-protocol

Length of output: 50387


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the photon_indexer.rs file and extract the struct definition
find . -name "photon_indexer.rs" -type f | head -5

Repository: Lightprotocol/light-protocol

Length of output: 119


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at the context around lines 1693-1695
sed -n '1680,1710p' sdk-libs/client/src/indexer/photon_indexer.rs

# Also find where this struct/field is defined
echo "=== LOOKING FOR STRUCT DEFINITION ==="
rg "struct.*AddressQueue\|start_index:" sdk-libs/client/src/indexer/photon_indexer.rs -A 5 | head -40

Repository: Lightprotocol/light-protocol

Length of output: 1388


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the AddressQueue type/struct definition in the response parsing
rg "struct.*AddressQueue|type.*AddressQueue" sdk-libs/client/src/indexer/photon_indexer.rs -B 2 -A 10 | head -60

Repository: Lightprotocol/light-protocol

Length of output: 54


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at more context around the response parsing to find the serde deserialization
sed -n '1600,1750p' sdk-libs/client/src/indexer/photon_indexer.rs | head -200

Repository: Lightprotocol/light-protocol

Length of output: 6656


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the Photon API response types to understand what start_index represents
find . -type f -name "*.rs" | xargs rg "photon_api.*AddressQueue|api_response.*address_queue" -l | head -5

# Also search for where these types come from
rg "use.*photon_api" sdk-libs/client/src/indexer/photon_indexer.rs | head -10

Repository: Lightprotocol/light-protocol

Length of output: 171


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for photon_api crate definition
find . -type f -name "photon_api*" -o -name "*photon*" | grep -E "Cargo.toml|lib.rs" | head -10

# Also check if photon_api is external
rg "photon-api|photon_api" Cargo.toml -l | head -5

Repository: Lightprotocol/light-protocol

Length of output: 124


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for photon_api crate definition
find . -type f -name "Cargo.toml" | xargs grep -l "photon" | head -10

# Check the Photon docs to understand what start_index is
find . -type d -name "*photon*" | head -10

Repository: Lightprotocol/light-protocol

Length of output: 289


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at the photon-api crate structure
ls -la sdk-libs/photon-api/

# Find type definitions for address queue
rg "struct.*AddressQueue|start_index" sdk-libs/photon-api/ -A 3 | head -80

Repository: Lightprotocol/light-protocol

Length of output: 5363


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the external photon directory for documentation
ls -la external/photon/ | head -30

# Look for any comments or documentation about start_index
rg "start_index" external/photon/ -B 3 -A 3 | head -100

Repository: Lightprotocol/light-protocol

Length of output: 178


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the docs mentioned in learnings
cat program-libs/batched-merkle-tree/docs/INSERT_ADDRESS_QUEUE.md

Repository: Lightprotocol/light-protocol

Length of output: 5337


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the update docs to understand tree_next_insertion_index
cat program-libs/batched-merkle-tree/docs/UPDATE_FROM_ADDRESS_QUEUE.md | head -150

Repository: Lightprotocol/light-protocol

Length of output: 170


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find docs files
find program-libs -name "*.md" -type f | head -20

# Also check for the CLAUDE.md file
cat program-libs/batched-merkle-tree/docs/CLAUDE.md

Repository: Lightprotocol/light-protocol

Length of output: 10407


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at how streaming_queue is used - what does tree_next_insertion_index() return?
sed -n '580,600p' forester/src/processor/v2/helpers.rs

# Check the structure of StreamingAddressQueue
rg "impl StreamingAddressQueue" forester/src/processor/v2/helpers.rs -A 50 | head -80

Repository: Lightprotocol/light-protocol

Length of output: 2617


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where streaming_queue data is populated from photon response
rg "StreamingAddressQueue\|tree_next_insertion_index" forester/src/processor/v2/helpers.rs -B 5 -A 5 | head -100

Repository: Lightprotocol/light-protocol

Length of output: 54


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at where the streaming queue wraps the photon response data
rg "StreamingAddressQueue::new|tree_next_insertion_index:" forester/src/processor/v2/ -B 3 -A 3 | head -100

Repository: Lightprotocol/light-protocol

Length of output: 54


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Let me check test_indexer to see how it distinguishes start_index from tree_next_insertion_index
sed -n '720,735p' sdk-libs/program-test/src/indexer/test_indexer.rs

Repository: Lightprotocol/light-protocol

Length of output: 833


Line 1694 incorrectly assigns aq.start_index to tree_next_insertion_index—these are separate concerns.

The test indexer (lines 729–730) and forester's AddressStagingTree initialization confirm the distinction:

  • start_index: where the indexed queue slice begins
  • tree_next_insertion_index: the tree's actual next insertion position (the leaf index the staging tree should initialize to)

When forester calls AddressStagingTree::new(..., start_index) at line 183 of forester/src/processor/v2/strategy/address.rs, it uses this value to track which leaves have already been processed. If tree_next_insertion_index is set to the queue's slice start instead of the tree's true next leaf position, alignment checks will fail and batches will be skipped or processed from the wrong position.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/indexer/photon_indexer.rs` around lines 1693 - 1695, The
assignment is wrong: do not set tree_next_insertion_index to aq.start_index; use
the queue's stored tree insertion position instead so the staging tree aligns
with the real next leaf. Update the struct initialization where start_index:
aq.start_index, tree_next_insertion_index: aq.start_index, root_seq: aq.root_seq
to set tree_next_insertion_index from the queue's actual next-insertion field
(e.g., aq.tree_next_insertion_index or the correct field that records the tree's
next leaf) so AddressStagingTree::new(..., start_index) and downstream alignment
checks use the true tree insertion index rather than the slice start.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sdk-libs/sdk-types/src/interface/account/token_seeds.rs (1)

268-280: ⚠️ Potential issue | 🟠 Major

Don’t silently ignore output_queue_index; enforce consistency or remove it end-to-end.

Line 268 makes output_queue_index unused, while Line 280 now sources queue_pubkey_index only from tree_info. Since callers still pass output_queue_index (see sdk-libs/sdk-types/src/interface/program/decompression/token.rs), mismatches become silent and can produce incorrect merkle context assembly.

Proposed guard to prevent silent divergence
     fn into_in_token_data(
         &self,
         tree_info: &PackedStateTreeInfo,
-        _output_queue_index: u8,
+        output_queue_index: u8,
     ) -> Result<MultiInputTokenDataWithContext, LightSdkTypesError> {
+        if output_queue_index != tree_info.queue_pubkey_index {
+            return Err(LightSdkTypesError::InvalidInstructionData);
+        }
         Ok(MultiInputTokenDataWithContext {
             amount: self.token_data.amount,
             mint: self.token_data.mint,
             owner: self.token_data.owner,
             version: self.token_data.version,
             has_delegate: self.token_data.has_delegate,
             delegate: self.token_data.delegate,
             root_index: tree_info.root_index,
             merkle_context: PackedMerkleContext {
                 merkle_tree_pubkey_index: tree_info.merkle_tree_pubkey_index,
-                queue_pubkey_index: tree_info.queue_pubkey_index,
+                queue_pubkey_index: output_queue_index,
                 leaf_index: tree_info.leaf_index,
                 prove_by_index: tree_info.prove_by_index,
             },
         })
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/sdk-types/src/interface/account/token_seeds.rs` around lines 268 -
280, The function currently accepts output_queue_index but ignores it when
building PackedMerkleContext (using tree_info.queue_pubkey_index), which can
cause silent mismatches; update the constructor logic in the function that
returns MultiInputTokenDataWithContext to either (a) validate that
output_queue_index == tree_info.queue_pubkey_index and return an Err on
mismatch, or (b) use the provided output_queue_index when setting
PackedMerkleContext.queue_pubkey_index (and document the precedence), and ensure
callers in token.rs remain compatible; refer to symbols output_queue_index,
tree_info, PackedMerkleContext, and MultiInputTokenDataWithContext to locate and
fix the code path.
♻️ Duplicate comments (2)
prover/client/src/prover.rs (2)

42-44: ⚠️ Potential issue | 🟠 Major

Don’t treat a listening port as a healthy service.

Lines 42-44 still report healthy before any HTTP probe runs. A stale or unrelated listener can bypass startup and break downstream calls.

Suggested fix
 fn health_check_once(timeout: Duration) -> bool {
-    if prover_listener_present() {
-        return true;
-    }
+    let listener_present = prover_listener_present();
@@
     let mut stream = match TcpStream::connect_timeout(&addr, timeout) {
         Ok(stream) => stream,
         Err(error) => {
-            tracing::debug!(?error, endpoint, "prover health TCP connect failed");
+            tracing::debug!(?error, endpoint, listener_present, "prover health TCP connect failed");
             return health_check_once_with_curl(timeout);
         }
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/prover.rs` around lines 42 - 44, The health check currently
returns true if prover_listener_present() returns true, which lets any open
listener mark the service healthy; remove this early return and instead perform
the actual readiness probe (e.g., send an HTTP GET to the expected
health/readiness endpoint or call the real service initialization check) before
returning true. Update the function that contains prover_listener_present() (the
health-check/ready-check function) to only consider prover_listener_present() as
a preliminary hint and only return healthy after the HTTP probe (or the concrete
readiness function such as check_http_probe(), perform_readiness_probe(), or the
existing HTTP client call) succeeds; ensure proper error propagation/logging
when the probe fails.
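A minimal std-only sketch of the distinction (the `/health` path and the status-line check are assumptions for illustration, not the prover's actual endpoint): an accepting connection is only a hint, while health requires an HTTP answer.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

// Healthy only if the peer answers HTTP on the (assumed) /health path;
// a listener that accepts but never responds is treated as unhealthy.
fn http_health_probe(addr: &str, timeout: Duration) -> bool {
    let sock_addr = match addr.parse() {
        Ok(a) => a,
        Err(_) => return false,
    };
    let mut stream = match TcpStream::connect_timeout(&sock_addr, timeout) {
        Ok(s) => s,
        Err(_) => return false, // nothing listening at all
    };
    let _ = stream.set_read_timeout(Some(timeout));
    let request = b"GET /health HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";
    if stream.write_all(request).is_err() {
        return false;
    }
    let mut response = Vec::new();
    // read_to_end returns once the server closes (Connection: close) or the
    // read timeout fires; bytes read so far stay in the buffer either way.
    let _ = stream.read_to_end(&mut response);
    response.starts_with(b"HTTP/1.1 200")
}
```

With this shape, a stale socket that merely accepts connections fails the probe instead of short-circuiting startup.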

152-167: ⚠️ Potential issue | 🟠 Major

Reset IS_LOADING even when startup panics.

Line 166 is not reached on panic paths inside the startup block, so IS_LOADING can remain true and deadlock subsequent startup attempts.

Suggested fix
+        struct LoadingGuard;
+        impl Drop for LoadingGuard {
+            fn drop(&mut self) {
+                IS_LOADING.store(false, Ordering::Release);
+            }
+        }
+        let _loading_guard = LoadingGuard;
+
         let spawn_result = async {
             Command::new(&prover_path)
                 .arg("start-prover")
                 .spawn()
                 .unwrap_or_else(|error| panic!("Failed to start prover process: {error}"));
@@
         }
         .await;
-
-        IS_LOADING.store(false, Ordering::Release);
         spawn_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/prover.rs` around lines 152 - 167, The IS_LOADING flag can
remain true if the async startup block panics, so create an RAII guard (e.g., a
small LoadingGuard struct) that calls IS_LOADING.store(false, Ordering::Release)
in its Drop impl, instantiate it before running the async startup block (the
block that uses Command::new(&prover_path), health_check, and assigns
spawn_result), and remove the explicit IS_LOADING.store(false,
Ordering::Release) at the end; this ensures IS_LOADING is cleared regardless of
panics during the startup sequence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@prover/client/src/prover.rs`:
- Around line 173-184: The health_check function currently calls the blocking
helper health_check_once directly inside the async runtime; change it to offload
blocking work by calling tokio::task::spawn_blocking(||
health_check_once(timeout)) and await the JoinHandle (e.g., let ok =
tokio::task::spawn_blocking(move ||
health_check_once(timeout)).await.unwrap_or(false)); use that boolean result in
the existing logic (return true if ok), ensuring health_check_once remains sync
and any blocking I/O runs inside spawn_blocking so the Tokio worker thread is
not blocked.

In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2577-2621: The parse/read error handling currently uses `?` inside
the success branch (specifically the
`response_result.text().await.map_err(|error| { ... })?` and
`deserialize_gnark_proof_json(&body).map_err(|error| ...) ?` calls), which
short-circuits out of the retry loop; replace those `?` usages by handling
errors locally: call `response_result.text().await` and if it Err, set
`last_error` to a descriptive message (similar to the existing error branches)
and continue the retry loop instead of returning; likewise call
`deserialize_gnark_proof_json(&body)` and on Err set `last_error` (including a
body-preview if helpful) and continue; only when both read and deserialize
succeed proceed to `proof_from_json_struct`, `compress_proof`, and return
`Ok(ValidityProofWithContext { ... })`; keep using the existing `last_error` and
`retries` logic so network/parse failures trigger retries instead of immediate
failure.
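The retry-instead-of-`?` pattern the prompt describes, shown with hypothetical stand-ins for the network fetch and proof deserialization (not the indexer's real API):

```rust
// Stand-in for the network call: fails twice, then returns a body.
fn fetch_body(attempt: usize) -> Result<String, String> {
    if attempt < 2 {
        Err(format!("connection reset on attempt {attempt}"))
    } else {
        Ok("42".to_string())
    }
}

// Record failures in last_error and continue the loop instead of using `?`,
// so both read errors and parse errors trigger retries.
fn fetch_with_retries(retries: usize) -> Result<u32, String> {
    let mut last_error = String::from("no attempts made");
    for attempt in 0..retries {
        let body = match fetch_body(attempt) {
            Ok(body) => body,
            Err(error) => {
                last_error = format!("read failed: {error}");
                continue; // retry instead of short-circuiting out of the loop
            }
        };
        match body.parse::<u32>() {
            Ok(value) => return Ok(value),
            Err(error) => {
                last_error = format!("parse failed: {error} (body: {body})");
                continue;
            }
        }
    }
    Err(last_error)
}
```

Only when both the read and the parse succeed does the function return a value; otherwise the last failure is reported after the retries are exhausted.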

---

Outside diff comments:
In `@sdk-libs/sdk-types/src/interface/account/token_seeds.rs`:
- Around line 268-280: The function currently accepts output_queue_index but
ignores it when building PackedMerkleContext (using
tree_info.queue_pubkey_index), which can cause silent mismatches; update the
constructor logic in the function that returns MultiInputTokenDataWithContext to
either (a) validate that output_queue_index == tree_info.queue_pubkey_index and
return an Err on mismatch, or (b) use the provided output_queue_index when
setting PackedMerkleContext.queue_pubkey_index (and document the precedence),
and ensure callers in token.rs remain compatible; refer to symbols
output_queue_index, tree_info, PackedMerkleContext, and
MultiInputTokenDataWithContext to locate and fix the code path.

---

Duplicate comments:
In `@prover/client/src/prover.rs`:
- Around line 42-44: The health check currently returns true if
prover_listener_present() returns true, which lets any open listener mark the
service healthy; remove this early return and instead perform the actual
readiness probe (e.g., send an HTTP GET to the expected health/readiness
endpoint or call the real service initialization check) before returning true.
Update the function that contains prover_listener_present() (the
health-check/ready-check function) to only consider prover_listener_present() as
a preliminary hint and only return healthy after the HTTP probe (or the concrete
readiness function such as check_http_probe(), perform_readiness_probe(), or the
existing HTTP client call) succeeds; ensure proper error propagation/logging
when the probe fails.
- Around line 152-167: The IS_LOADING flag can remain true if the async startup
block panics, so create an RAII guard (e.g., a small LoadingGuard struct) that
calls IS_LOADING.store(false, Ordering::Release) in its Drop impl, instantiate
it before running the async startup block (the block that uses
Command::new(&prover_path), health_check, and assigns spawn_result), and remove
the explicit IS_LOADING.store(false, Ordering::Release) at the end; this ensures
IS_LOADING is cleared regardless of panics during the startup sequence.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 12323e3c-a7c5-473c-847a-f66e3b6762e8

📥 Commits

Reviewing files that changed from the base of the PR and between f9a788c and 7ab1c90.

⛔ Files ignored due to path filters (1)
  • sdk-tests/csdk-anchor-full-derived-test/tests/basic_test.rs is excluded by none and included by none
📒 Files selected for processing (6)
  • prover/client/src/errors.rs
  • prover/client/src/prover.rs
  • sdk-libs/client/src/interface/instructions.rs
  • sdk-libs/program-test/src/indexer/test_indexer.rs
  • sdk-libs/sdk-types/src/interface/account/token_seeds.rs
  • sdk-libs/sdk-types/src/interface/program/decompression/pda.rs

Comment on lines 173 to 184
pub async fn health_check(retries: usize, timeout: usize) -> bool {
    let timeout = Duration::from_secs(timeout as u64);
    let retry_delay = timeout;

    for attempt in 0..retries {
        if health_check_once(timeout) {
            return true;
        }

        if attempt + 1 < retries {
            sleep(retry_delay).await;
        }
Copy link
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify blocking std I/O and process calls are executed from async health-check path.
set -euo pipefail

rg -nP --type=rust '\bpub\s+async\s+fn\s+health_check\b|health_check_once\(' prover/client/src/prover.rs -C3
rg -nP --type=rust 'TcpStream::connect_timeout|\.read\(|Command::new\("lsof"\)|Command::new\("curl"\)' prover/client/src/prover.rs -C2

Repository: Lightprotocol/light-protocol

Length of output: 1319


🏁 Script executed:

# Check for existing spawn_blocking usage patterns in the codebase
rg -n 'spawn_blocking' --type=rust | head -20

Repository: Lightprotocol/light-protocol

Length of output: 637


🏁 Script executed:

# Check if CLAUDE.md exists and contains async/blocking guidelines
find . -name 'CLAUDE.md' -o -name 'claude.md' | xargs cat 2>/dev/null || echo "CLAUDE.md not found"

Repository: Lightprotocol/light-protocol

Length of output: 50386


🏁 Script executed:

# Find all callers of health_check to understand the context
rg -n '\bhealth_check\(' --type=rust -B2 -A2 | grep -v 'fn health_check'

Repository: Lightprotocol/light-protocol

Length of output: 1234


🏁 Script executed:

# Verify the exact lines and confirm the diff context
sed -n '173,190p' prover/client/src/prover.rs

Repository: Lightprotocol/light-protocol

Length of output: 436


🏁 Script executed:

# Check if health_check_once is awaitable or blocking
rg -n 'fn health_check_once' prover/client/src/prover.rs -A1

Repository: Lightprotocol/light-protocol

Length of output: 294


Move blocking health probes off the Tokio worker thread.

The health_check_once call at line 178 performs blocking socket and process I/O (TcpStream::connect_timeout, stream.read, Command::new) directly within an async function. This blocks runtime workers and can degrade other async tasks waiting on the executor.

Wrap the blocking work in tokio::task::spawn_blocking:

Suggested fix
 pub async fn health_check(retries: usize, timeout: usize) -> bool {
     let timeout = Duration::from_secs(timeout as u64);
     let retry_delay = timeout;
 
     for attempt in 0..retries {
-        if health_check_once(timeout) {
+        let check_timeout = timeout;
+        let ok = tokio::task::spawn_blocking(move || health_check_once(check_timeout))
+            .await
+            .unwrap_or(false);
+        if ok {
             return true;
         }
 
         if attempt + 1 < retries {
             sleep(retry_delay).await;
         }
     }
+
+    false
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
The suggestion replaces the `health_check` body shown in the commented hunk with:

pub async fn health_check(retries: usize, timeout: usize) -> bool {
    let timeout = Duration::from_secs(timeout as u64);
    let retry_delay = timeout;

    for attempt in 0..retries {
        let check_timeout = timeout;
        let ok = tokio::task::spawn_blocking(move || health_check_once(check_timeout))
            .await
            .unwrap_or(false);
        if ok {
            return true;
        }

        if attempt + 1 < retries {
            sleep(retry_delay).await;
        }
    }

    false
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prover/client/src/prover.rs` around lines 173 - 184, The health_check
function currently calls the blocking helper health_check_once directly inside
the async runtime; change it to offload blocking work by calling
tokio::task::spawn_blocking(|| health_check_once(timeout)) and await the
JoinHandle (e.g., let ok = tokio::task::spawn_blocking(move ||
health_check_once(timeout)).await.unwrap_or(false)); use that boolean result in
the existing logic (return true if ok), ensuring health_check_once remains sync
and any blocking I/O runs inside spawn_blocking so the Tokio worker thread is
not blocked.
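The fix itself uses tokio::task::spawn_blocking. As a dependency-free illustration of the underlying idea, blocking work can be pushed onto a dedicated thread and its result received through a channel, so the calling thread only waits on a cheap handoff rather than on socket or process I/O:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for health_check_once: a blocking sleep in place of socket I/O.
fn blocking_probe() -> bool {
    thread::sleep(Duration::from_millis(20));
    true
}

// Run the blocking probe off the calling thread; an async runtime's
// spawn_blocking does the same with a managed thread pool and an awaitable
// JoinHandle instead of a channel.
fn offloaded_probe() -> bool {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(blocking_probe());
    });
    rx.recv().unwrap_or(false) // mirrors .await.unwrap_or(false)
}
```

In the real fix the runtime polls the handle asynchronously, so executor workers stay free to drive other tasks while the probe blocks.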

